Within the limited research literature on the topic, there is considerable controversy over the usefulness of
stereoscopic TV displays for performing remote manipulation tasks. Some investigators argue that a second video
channel might just as well be allocated to a camera with an appropriately separated view of the worksite - an
"orthogonal" view to that of the first camera. Other researchers argue that even though operators tend to express
strong subjective preferences for stereoscopic displays, these displays often do not provide objective
performance advantages. In this experiment we required a group of relatively inexperienced manipulator operators
to perform a complex and difficult line threading task remotely and varied the visual displays available to the
operators while performing this task. For each video display condition tested, the operator sat in a centered
position facing two CRTs, each providing a separate view of the remote task site. Three combinations of video
display types were tested: 1) monoscopic view plus orthogonal view, 2) stereoscopic view plus orthogonal view,
and 3) stereoscopic view plus monoscopic view. Total task completion times, manipulative errors, and operator
gaze preferences were measured for each combination of display types. Results show a strong and consistent
operator viewing preference for stereoscopic displays as well as substantial and statistically significant
performance advantages for those display combinations that provided a stereoscopic view over those that
provided only monoscopic views.


1.  INTRODUCTION 

While  a  growing  number  of  researchers  and  operational  users  of  remote  manipulator  systems  contend  that 
stereoscopic  displays  provide  consistent  and  substantial  performance  advantages  over  corresponding 
monoscopic  displays  for  many  important  real-world  applications,  others  are  of  the  opinion  that  a  second  video 
channel  might  be  just  as  effectively  allocated  to  a  camera  with  an  appropriately  offset  point  of  view  of  the  work  site, 
a so-called "orthogonal" view to that of the first camera. Measures of subjective preference between stereoscopic
and  monoscopic  displays  of  the  same  remote  scene  have  generally  not  been  conducted  by  simultaneous 
comparison  of  carefully  matched  alternative  displays  that  have  been  counterbalanced  for  order  of  presentation  or 
position  of  presentation  over  a  series  of  testing  sessions.  Previous  comparisons  between  stereoscopic  and 
monoscopic  displays  have  relied  almost  exclusively  on  operators’  verbal  reports  of  which  display  they  liked  best 
subsequent  to  testing  as  the  sole  dependent  measure  of  display  preference.  One  notable  exception  that 
involved  the  use  of  objective  behavioral  observation  techniques  to  measure  operator  display  preferences  is  a 
study  undertaken  by  researchers  at  the  Atomic  Energy  Research  Establishment  at  Harwell  in  the  United  Kingdom 
and briefly reported at last year's SPIE conference on stereoscopic displays and applications [1,2]. The
Harwell  group  found  that  when  operators  were  provided  a  simultaneous  choice  between  viewing  a  stereoscopic 
display  and  an  "orthogonal"  monoscopic  display  while  performing  a  remote  pick-and-place  manipulation  task, 
operators  were  observed  to  show  a  strong  preference  for  the  stereoscopic  display  as  evidenced  by  which  display 
they  were  observed  viewing  while  performing  the  task.  In  conducting  the  experiment  reported  here,  we  hoped  to 
replicate  the  Harwell  findings  with  a  different,  but  commensurately  challenging  remote  manipulation  task.  We  used 
readily  observable  head  aiming  behaviors  as  an  objective  measure  of  viewing  preference,  while  at  the  same  time 
measuring task completion times as well as inadvertent collisions of the manipulator arm with the task board.

In total, six display combinations were presented to each operator over the course of the experiment.
These combinations are summarized below in Table 1.


TABLE 1. Six display combinations tested.

COMBINATION   LEFT-SIDE DISPLAY POSITION       RIGHT-SIDE DISPLAY POSITION
A)            Monoscopic (Camera 1)            Orthogonal (Camera 3)
B)            Stereoscopic (Cameras 1 & 2)     Monoscopic (Camera 1)
C)            Stereoscopic (Cameras 1 & 2)     Orthogonal (Camera 3)
D)            Orthogonal (Camera 3)            Monoscopic (Camera 1)
E)            Monoscopic (Camera 1)            Stereoscopic (Cameras 1 & 2)
F)            Orthogonal (Camera 3)            Stereoscopic (Cameras 1 & 2)


The general experimental issues that we sought to address were as follows: 1) for those situations
in which a manipulator operator is provided two separate televised views of a remote work site, which
combination(s) of displays support(s) the most efficient performance (i.e., fastest task completion times and fewest
undesirable collisions with the taskboard), and 2) which display type is preferred - monoscopic, orthogonal
monoscopic, or stereoscopic - when a choice must be made between two alternatives?

When the operator is required to perform a task that requires precision alignment in depth and orientation,
as  was  the  case  for  the  line  threading  task  selected  for  this  experiment,  we  would  expect  display  combinations  that 
provide  more  accurate  depth  and  3-D  orientation  information  to  the  operator  to  support  more  efficient  task 
performance.  Additionally,  we  would  expect  that  a  display  from  which  depth  and  orientation  information  could  be 
accurately  and  more  readily  interpreted  would  be  viewed  during  a  greater  proportion  of  total  task  time  than  an 
alternate  display.  Thus,  we  hypothesized  the  following  outcomes,  given  the  range  of  display  combinations  that 
were  tested: 

1) a stereoscopic display would be clearly superior to a "simple" monoscopic display when that display is
identical to one of the two viewpoints comprising the stereoscopic display, because no additional depth or
orientation information is provided by such a monoscopic display. That is, it is redundant to the stereoscopic
display. With reference to Table 1, combinations B, C, E, and F, which include a stereoscopic display, should
provide superior performance over combinations A and D, which do not include a stereoscopic display.

2) a stereoscopic display would be preferred when directly pitted against an orthogonal monoscopic display,
because depth and orientation information is less ambiguous and less sensitive to viewpoint [1-3] than it is for
the orthogonal display. However, since the orthogonal display would provide useful depth and orientation
information for some limited portion of the total task, we would expect the observed superiority of the stereoscopic
display to be somewhat less pronounced than in hypothesized outcome 1, above. In more specific terms,
we would expect combinations C and F to be preferred to B and E.

3)  Since  both  provide  ambiguous  2-D  representations  of  the  complex  3-D  task,  little  difference  in  viewing 
preference  would  be  expected  between  the  monoscopic  display  and  orthogonal  monoscopic  display  (i.e., 
between  the  two  displays  used  in  combinations  A  and  D). 

4)  Given  the  close  contiguity  of  the  two  display  screens  used  in  this  experiment,  we  would  expect  little  or  no 
effect  for  side  of  presentation  of  a  particular  display  type,  though  we  do  include  it  as  a  counterbalanced 
factor  in  the  experimental  design  since  spatial  compatibility  of  control  inputs  and  displays  has  clearly  been 
shown  to  exert  an  influence  on  performance. 


2.  METHODS 

2.1  Operators 

Six experimental operators were tested. All were practiced (i.e., familiar with both the manipulator and the
line threading task under direct viewing conditions, though not practiced with the specific display configurations
used in the experiment). Amount of experience using the manipulator (a CRL Model G master-slave unit) prior to
commencement of data collection varied widely among operators. One operator had no more than two hours prior
experience using the manipulator, three operators had from 6 to 10 hours prior experience, and the remaining 2
operators each had in excess of 40 hours prior experience. All operators were screened for normal visual acuity
and stereoacuity using a battery of standardized vision tests [4] and random dot stereograms. With the exception
of one individual, operators were uninformed as to the purpose of the experiment, most particularly to the fact that
their viewing preferences were a major subject of interest in the experiment. Fortunately, a comparison of data
collected from the one "informed" operator with data collected from the other "uninformed" operators did not
suggest any substantial deviations, so his data were included in the final analyses presented below. All operators
participated in the study during normal working hours and none were paid any amount over and above their normal
hourly wages for participating.

2.2  Display  Interface 

A  side-by-side  pair  of  48  cm  (19  inch)  diagonal  black  and  white  TV  monitors  (Panasonic  Model  WV5490) 
provided  the  operator's  only  views  of  the  task  since  a  direct  view  of  the  taskboard  was  completely  blocked  by  an 
opaque curtain. The monitors were centered approximately 1.5 meters in front of the operator, in the arrangement
diagrammed below in Figure 1. A 60 Hz field sequential technique that provided a 30 Hz monocular alternation rate
was used for stereoscopic display [5]. Brightness and contrast levels of all three display types tested (i.e.,
monoscopic, orthogonal monoscopic, and stereoscopic) were equalized and held constant throughout the entire
experiment. Under all test conditions, even those not involving use of a stereoscopic display (i.e., A and D in
Table 1), operators wore a pair of light shutter glasses for stereo channel separation. These glasses restricted the
operator's binocular field of view to a sector approximately 35° horizontal by 15° vertical. Consequently, as is
diagrammed in Figure 1, the operator was not able to center his view of one display screen while simultaneously
viewing the other screen. Thus, he was forced to use easily observable panning head movements rather than eye
movements to look from one display to the other. In addition to the shutter glasses, the operator wore a
lightweight  set  of  stereo  headphones  through  which  broadband,  "pink"  noise  of  constant  average  intensity  was 
played.  This  had  the  effect  of  masking  any  audio  cues  that  an  operator  might  use  in  performing  the  threading  task. 
Overall  brightness  of  the  displays  and  the  ambient  lighting  in  the  operator's  control  station  area  were  held  low  to 
minimize  the  sensation  of  flicker  experienced  with  60Hz  field  sequential  stereo  display  systems  used  at  higher 
light  levels.  Magnification  for  all  three  display  types  was  matched  at  a  close  approximation  of  0.7. 
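
The viewing geometry described above can be checked with a short calculation. The following is only a rough sketch: the 60 Hz field rate, the 48 cm diagonal, the 1.5 m viewing distance, and the 35° horizontal field of view come from the text, whereas the 4:3 aspect ratio, the assumption that the two screens abut with no bezel, and all variable names are illustrative assumptions.

```python
import math

# Values stated in the text.
FIELD_RATE_HZ = 60.0        # field sequential display rate
DIAGONAL_M = 0.48           # monitor diagonal (48 cm)
VIEW_DISTANCE_M = 1.5       # approximate operator-to-monitor distance
FOV_HORIZONTAL_DEG = 35.0   # horizontal field of view through the shutter glasses

# Left and right fields alternate, so each eye receives half the fields.
per_eye_rate_hz = FIELD_RATE_HZ / 2

# ASSUMPTION: 4:3 aspect ratio (3-4-5 triangle), so width = 4/5 of the diagonal.
screen_width_m = DIAGONAL_M * 4 / 5

# Visual angle subtended by one screen viewed head-on at its center.
screen_width_deg = math.degrees(
    2 * math.atan((screen_width_m / 2) / VIEW_DISTANCE_M)
)

# ASSUMPTION: screens abut in a flat plane. With the head aimed at the center
# of one screen, the far edge of the other lies 1.5 screen widths off-axis.
far_edge_deg = math.degrees(math.atan(1.5 * screen_width_m / VIEW_DISTANCE_M))

print(f"per-eye rate: {per_eye_rate_hz:.0f} Hz")
print(f"one screen subtends about {screen_width_deg:.1f} deg")
print(f"far edge of other screen at about {far_edge_deg:.1f} deg off-axis")
print(f"half of field of view: {FOV_HORIZONTAL_DEG / 2:.1f} deg")
```

Under these assumptions the far edge of the neighboring screen falls outside half of the 35° horizontal field, which is consistent with the statement that operators had to pan their heads to look from one display to the other.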


Figure 1. Top-view Diagram of Laboratory Layout Used in This Experiment.

2.3 Camera Configuration

A miniature pair of high-quality, black and white video cameras (Pulnix Model TM-540) comprised the
stereo camera head. An interaxial separation of 65 mm was used. Cameras were converged by the simple "toe-in"
method to a point corresponding to the center of the circular-shaped taskboard, some 1.5 meters in front of the
camera pair. As can be seen in Figure 1, the orthogonal monoscopic camera was also positioned 1.5 meters from
taskboard center but it was offset from the stereo camera head by 45° relative to the reference taskboard
orientation. In setting up all three cameras, care was taken to maintain constant position, focus, aperture setting,
and convergence throughout the entire experiment. Special care was given to the problem of eliminating vertical
disparities in stereoscopic images, though this was not entirely possible given the camera convergence technique
that was employed [6].
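
To give a sense of scale for the toe-in arrangement, the convergence angle implied by the stated 65 mm interaxial separation and 1.5 m convergence distance can be estimated from simple trigonometry. This is a sketch only: it assumes the two cameras are placed symmetrically about the line to the taskboard center, and the function name is hypothetical.

```python
import math

def convergence_angle_deg(interaxial_m: float, distance_m: float) -> float:
    """Total angle between the two optical axes of a symmetric toe-in pair.

    ASSUMPTION: cameras sit symmetrically about the midline, each rotated
    inward so that its axis passes through the convergence point.
    """
    half_angle_rad = math.atan((interaxial_m / 2) / distance_m)
    return math.degrees(2 * half_angle_rad)

# Values stated in the text: 65 mm separation, 1.5 m to taskboard center.
total_deg = convergence_angle_deg(0.065, 1.5)
per_camera_deg = total_deg / 2

print(f"total convergence angle: {total_deg:.2f} deg")
print(f"toe-in per camera: {per_camera_deg:.2f} deg")
```

The resulting total angle is roughly 2.5°, that is, a little over 1° of inward rotation per camera.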

2.4  The  Line  Feeder  Task 

The manipulative task was a complex one that involved threading a line through a predesignated series of
8 obscured eyelets that were arranged in varying depths and orientations on a flat circular taskboard backplate
(see Figure 2). A graduated series of "stalks" of varying lengths (i.e., 8.9, 10.2, 11.4, 12.7, 15.2, 16.5, 17.8, 19,
and 20.3 cm) offset the eyelets to varying depths from the taskboard backplate. All eyelets were identical,
consisting of a 2.5 cm long, hollow ring with an inner diameter of 4.5 cm through which the operator threaded a
3.2 cm rubber coated cylinder with an attached line (i.e., 1 cm or 3/8 inch diameter, braided nylon rope). Thus, the
physical tolerance for threading the line was held constant for all 8 eyelets. Each eyelet contained a small IR
emitter/receiver  circuit  for  automated  recording  of  the  time  required  for  line  threading.  In  the  taskboard  area  of  the 
test  room  high  intensity  diffuse  lighting  from  multiple  light  sources  was  used  to  reduce  the  usefulness  of  shadow 
cues  produced  when  the  manipulator  and  line  approached  any  part  of  the  taskboard  [see  7,  page  23  for  further 
details].  In  order  to  eliminate,  or  at  least  minimize,  the  usefulness  of  the  visual  cue  of  relative  size  of  eyelets, 
eyelet  covers  were  used.  As  stated  earlier,  these  obscured  an  operator's  televised  view  of  the  eyelet.  Eyelet 
covers  were  coated  with  black  flocked  paper  to  reduce  the  effectiveness  of  shadow  cues  on  approach,  and 
attached  arrows  signalled  the  direction  in  which  the  line  was  to  be  threaded.  They  were  ellipsoidal  in  shape  and 
were  varied  in  size  so  that  their  relative  size  in  the  displays  was  not  a  reliable  cue  to  depth  of  the  eyelet  they 
obscured. Covers were attached to eyelets with strips of magnetic material. If an eyelet was not touched or
otherwise disturbed by the manipulator or the line, its cover remained attached. However, when physically
disturbed during the process of line threading, the cover would detach and fall to the floor. Operators were instructed to
avoid detaching eyelet covers, and we used their detachment as a rough index of inadvertent collisions or disturbance
of the taskboard during the process of line threading. To further complicate the task for the operator and force him
to rely on visual feedback provided through the video display interface, we changed the orientation of the
taskboard on a trial-by-trial basis in quasi-random fashion.


Figure 2. Views of the Line Threading Taskboard Used in this Experiment.
Panels: close-up view of a single eyelet with attached eyelet cover; frontal view of taskboard (note the line with
rubber coated cylinder at its end).


2.5  Measurement  of  Gaze  Preference 

As illustrated in Figure 1, a video camera was positioned unobtrusively above and behind the operator
during testing. Unbeknownst to all but one of the operators, this camera provided a video record that could
subsequently be used to determine which display the operator viewed while performing the manipulation task. As
an aid to the scorer of the video recording, a small white aim indicator was attached to the crown of the set of
headphones worn by the operator. This aim indicator provided an unambiguous indication of the direction in
which the operator's head was pointed at any given instant during testing. To further aid the scorer, identifying text
information was overlaid in the video that uniquely identified each trial for each operator on each day of testing.
Additionally, a video screen splitter was used to continuously log, in the same video recording, a view of the
taskboard  for  verification  of  collision  errors.  While  scoring  the  video  records,  the  scorer  was  unaware  of  the 
particular  viewing  conditions  being  tested  during  a  particular  session. 
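
One way the scored video record could be reduced to a per-trial gaze preference score is sketched below. The paper does not describe the scoring format, so the interval representation (start time, end time, display label), the function name, and the example values are all hypothetical.

```python
from collections import defaultdict

def viewing_proportions(intervals):
    """Return the fraction of scored time spent viewing each display.

    `intervals` is a list of (start_s, end_s, display_label) tuples, a
    HYPOTHETICAL format for head-aim intervals scored from the video.
    """
    totals = defaultdict(float)
    for start_s, end_s, label in intervals:
        if end_s < start_s:
            raise ValueError("interval ends before it starts")
        totals[label] += end_s - start_s
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no scored viewing time")
    return {label: t / grand_total for label, t in totals.items()}

# Illustrative trial (made-up numbers, not data from the experiment).
example_trial = [
    (0.0, 40.0, "stereo"),
    (40.0, 45.0, "ortho"),
    (45.0, 100.0, "stereo"),
]
proportions = viewing_proportions(example_trial)
print(proportions)
```

Scores of this kind, one per trial, correspond to the "proportion of time spent viewing the stereoscopic display" measure analyzed in the Results.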

2.6  Testing  Procedure 

Each  of  the  6  operators  was  tested  during  6  separate  testing  sessions  for  an  experiment  total  of  36 
sessions.  Order  of  presentation  of  the  6  display  configurations  listed  in  Table  1  was  counterbalanced  across  the  6 
operators  to  minimize  any  systematic  effects  on  results.  Each  session  required  approximately  one  hour  to 
complete. During each session an operator was required to perform 12 discrete threading trials. Each trial
consisted of threading the line through a predesignated series of 8 eyelets. Trial times were recorded
automatically by a controlling computer. Operators were instructed to emphasize precision of operation by
minimizing inadvertent collisions with the taskboard that would result in detaching eyelet covers. Prior to each trial,
the taskboard was moved to a different orientation relative to the cameras and manipulator in the manner
previously used in our laboratory [8]. In this way, orientation of the taskboard was randomized on a trial by trial
basis, forcing the operator to rely more heavily on immediate visual cues rather than on learned position and
orientation of the taskboard over an extended series of repetitive trials. The experimenter counted, recorded, and
replaced any detached eyelet covers at the conclusion of each trial. Following each trial the operator was informed
of total trial time and number of eyelet covers detached during that trial.
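
The text states that presentation order was counterbalanced across the 6 operators but does not name the scheme used. One common choice for six conditions and six participants is a Latin square; the sketch below builds a simple cyclic one purely as an illustration and should not be read as the authors' actual schedule. The function name is hypothetical.

```python
def cyclic_latin_square(conditions):
    """Build an n x n cyclic Latin square of presentation orders.

    Row i gives the session order for operator i. Each condition appears
    exactly once per operator (row) and once per session position (column).
    ASSUMPTION: illustrative scheme only; the paper does not specify one.
    """
    n = len(conditions)
    return [
        [conditions[(row + col) % n] for col in range(n)]
        for row in range(n)
    ]

# Display combinations A-F from Table 1.
schedule = cyclic_latin_square(["A", "B", "C", "D", "E", "F"])
for operator, order in enumerate(schedule, start=1):
    print(f"operator {operator}: {' '.join(order)}")
```

A cyclic square balances session position but not which condition precedes which; a design that also balances immediate carryover would need a different construction.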


3.  RESULTS 

3.1 Task Completion Times and Inadvertent Collisions

A three-way repeated measures analysis of variance (ANOVA) was run on Total Task Completion Times
with Display Combination, Position of Display, and Trial Repetitions serving as independent factors in the analysis.
Of the three factors analyzed, only Display Combination was found to exert a significant effect on Task
Completion Times [omnibus F-value = 14.6, df = 2, Greenhouse-Geisser (G-G) corrected p < .02]. No interactive
effects in the analysis were found to be statistically significant. The effects of Display Combination on Task
Completion Times in seconds are graphed below in Figure 3. As inspection of Figure 3 reveals, there was an
approximate 26% reduction in average Task Completion Time for the Stereo-Ortho display combination versus the
Ortho-Mono combination, and this mean difference was statistically significant [F = 19.4, df = 1, G-G corrected p <
.02]. Additionally, there was an approximate 29% reduction in average Task Completion Time for the Stereo-Mono
combination versus the Ortho-Mono combination. This mean difference was also found to be statistically
significant [F = 24.14, df = 1, G-G corrected p < .01].
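
The percentage reductions quoted above are relative differences with the Ortho-Mono combination as the baseline. The calculation can be sketched as follows; the mean completion times are not given in the text, so the numbers in the example are hypothetical and chosen only to show the arithmetic.

```python
def percent_reduction(baseline: float, treatment: float) -> float:
    """Percentage by which `treatment` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - treatment) / baseline

# HYPOTHETICAL mean completion times in seconds (not data from the paper).
ortho_mono_mean_s = 100.0
stereo_ortho_mean_s = 74.0

reduction = percent_reduction(ortho_mono_mean_s, stereo_ortho_mean_s)
print(f"reduction relative to Ortho-Mono: {reduction:.0f}%")
```

The same formula applies to the error rates reported for Figure 4, with average eyelet covers detached per trial in place of completion time.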

Figure 3. Effects of Display Combination on Line Threading Task Completion Times. (Horizontal axis: Display Combination.)

Figure 4. Effects of Display Combination on Inadvertent Collisions with the Line Feeder Taskboard. (Horizontal axis: Display Combination.)

NOTE: The error bars graphed in these and subsequent figures represent standard errors (S.E.). Number
of trials averaged for each value plotted in Figures 3 and 4 was 96 (i.e., n_obs = 96).

A similar analysis of variance was run on Inadvertent Collision Errors. Like the analysis of Task Completion
Times, this analysis revealed that only the Display Combination effect significantly influenced error rates. The
effects of Display Combination on errors (actually, average number of eyelet covers downed per trial) are graphed
above in Figure 4. Inspection of Figure 4 reveals that there was an approximate 32% reduction in error rate for the
Stereo-Ortho display combination versus the Ortho-Mono combination [F = 9.22, df = 1, G-G corrected p < .02].
There was an approximate 44% reduction in average error rates for the Stereo-Mono combination versus the
Ortho-Mono combination [F = 11.93, df = 1, G-G corrected p < .01]. No significant difference was found between
the Stereo-Mono and Stereo-Ortho combinations.

Preprint of Paper in SPIE Proceedings, Vol. 1457, 1991.

3.2  Gaze  Preferences 

A chi-square test was run to test the hypothesis that when given a choice between a stereoscopic display
and one of the two monoscopic display types that were presented (i.e., configurations B, C, E, and F in Table 1),
operators would be equally likely to view either stereoscopic or monoscopic displays. If the hypothesis were true,
when averaged over the series of sessions run, this would result in viewing times approximating 50% for both
stereoscopic and monoscopic display types used within a session. Results of the analysis [chi-square = 232.65, df
= 5, p < .001] showed a very strong, consistent, and statistically significant preference for viewing stereoscopic
displays over either of the monoscopic display types tested.
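
The logic of testing observed viewing against an equal-preference (50%) expectation can be illustrated with a minimal goodness-of-fit calculation. This is a simplified two-category sketch with one degree of freedom; the paper reports df = 5 and does not spell out how its test was structured, so the example is not a reconstruction of that analysis. The counts are hypothetical.

```python
def chi_square_equal_preference(observed):
    """Chi-square statistic against the hypothesis of equal preference.

    `observed` holds counts of independent observations per category.
    The expected count under the null is the total split evenly.
    """
    total = sum(observed)
    expected = total / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)

# HYPOTHETICAL counts: 95 observations on the stereoscopic display,
# 5 on the monoscopic display.
statistic = chi_square_equal_preference([95, 5])
print(f"chi-square = {statistic:.1f} on 1 degree of freedom")
```

For comparison, the critical value for one degree of freedom at p = .001 is about 10.83, so a split this lopsided would lead to rejecting equal preference.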

A  three-factor,  repeated-measures  ANOVA  was  run  on  stereoscopic  display  preference  scores 
(proportion  of  time  spent  viewing  the  stereoscopic  display  during  a  trial)  with  Display  Combination  (Stereo-Mono  or 
Stereo-Ortho), Position of the Display (Left or Right), and Trial Number serving as main effects in the analysis.
None of the simple or interactive effects in the experiment were statistically significant. However, an interesting
trend was found for the Display Combination effect and this is plotted below in Figure 5. Inspecting Figure 5, one
can see the high proportion of time that operators viewed the stereoscopic display when either the simple
monoscopic (98.3% stereo preference) or the orthogonal monoscopic display (92.1% stereo preference) vied
for their attention. One can also see a slight, statistically non-significant (p = .096) tendency for operators to spend
approximately  6%  more  of  their  time  viewing  the  orthogonal  display  when  it  competes  for  their  attention  than  they 
spend  viewing  the  monoscopic  display  when  it  competes. 


A chi-square test was run to test the hypothesis that the two monoscopic display types were equally
preferred in those sessions where they vied for the operator's attention (i.e., combinations A and D). The attentive
reader will recall that we did not expect to find a difference in gaze preference for this viewing situation, since both
views provided similar information regarding the relative depth and orientation of the eyelets that had to be
threaded. It should be noted, however, that the orthogonal view did enable the operator to see around the eyelet
covers in many instances and this may well have proven advantageous. In view of this, the results of the chi-square
test were surprising because they revealed a substantial preference for the simple monoscopic view over the
orthogonal view [chi-square = 164.61, df = 1, p < .01]. This preference for the simple monoscopic over the
orthogonal monoscopic view is graphed in Figure 6, which also illustrates that there was no significant effect for the
position in which the monoscopic display types were presented.


Figure 5. Stereoscopic Display Viewing Preference as a Function of Display Configuration. n_obs = 120.

Figure 6. Display Preference for the Simple Monoscopic Display Over an Orthogonal View as a Function of Display Position. n_obs = 72.


4. DISCUSSION

Overall, the pattern of results found in this experiment strongly parallels the previously published findings
of the AERE - Harwell research group in the UK [1,2]. While their previous experiment showed an overall ~12%
task completion time advantage for a stereoscopic display over a monoscopic display, the present study showed
even more pronounced effects of ~26% and ~29% advantages in task completion times over orthogonal and
simple monoscopic displays, respectively. The differences between the remote manipulators and tasks used
in the previous experiment and this experiment might well account for this sizeable discrepancy between the
observed stereoscopic task time advantages. To elaborate, the manipulator used in the earlier experiment may
have imposed greater limitations on performance times than the direct-banded, "through-the-wall", master-slave
device with force feedback that was used in this experiment. In addition, the line threading task used here
probably demanded more precision for successful completion than the "toast rack" task employed in the earlier
experiment. Moreover, to a large extent, the visual components of the task required of operators in this
experiment were designed to enhance the value of stereoscopic cues by reducing the availability of other cues to
depth and orientation (e.g., relative size, well-defined shadows). This experiment also incorporated a simple
measure of error, of inadvertent collision with the taskboard, in the form of the detachable eyelet covers. The error
data collected here also demonstrated a strong advantage for the use of stereoscopic displays over simple
monoscopic displays [39% advantage] as well as orthogonal monoscopic displays [44% advantage]. Most striking
was the concordance between the previous experiment and this experiment with respect to gaze preferences.
Whereas Dumbreck et al. [2, p. 200] observed an overall stereoscopic display gaze preference of 94.6%, we
observed an overall 95.2% preference. Taken together, the results reported here offer strong support to the
general conclusion that stereoscopic displays are advantageous for remote performance of complex,
three-dimensional manipulation tasks. In this experiment, when compared to both simple and orthogonal monoscopic
displays, a stereoscopic display significantly and substantially lowered times required for overall task completion,
improved precision of operations, reduced inadvertent collisions with the taskboard, and was objectively observed
to be very strongly preferred when operators were given an immediate choice between viewing either a
stereoscopic display or a monoscopic one.

Our  initial  hypothesis  that  position  of  the  displays  used  in  this  experiment  would  exert  little  effect  on  the 
outcomes  of  the  performance  measures  was  supported  by  the  results.  No  significant  main  or  interactive  effects 
were  found  for  this  Display  Position  effect  in  any  of  the  analyses  conducted.  This  conclusion  must,  however,  be 
qualified  by  two  provisions.  First,  the  displays  used  for  this  experiment  presented  only  a  rather  modest  field  of 
view  and  were  centered  directly  in  front  of  the  operator  at  eye  level.  Secondly,  the  video  displays  were  also 
distanced just outside the zone of motion for the operator and manipulator master, and therefore did not
physically constrain the operator's control inputs. Both these provisions are important considerations in
designing  any  control  interface  for  remote  manipulation. 

Our  initial  hypothesis  of  no  significant  difference  between  the  simple  and  orthogonal  monoscopic  views 
was  not  supported  by  the  results  of  this  experiment.  A  strong,  consistent  preference  was  found  for  the  "simple" 
monoscopic  view  over  the  orthogonal  monoscopic  view.  Exactly  why  this  result  occurred  cannot  be  determined 
from  the  results  of  this  study,  but  may  have  been  due  to  a  closer,  or  more  "natural",  spatial  correspondence 
between  control  inputs  and  movement  feedback  provided  to  the  operator  by  the  video  display  [9].  Because  the 
operator’s  viewpoint  (i.e.,  distance,  angle  of  regard,  field  of  view  relative  to  the  display  surfaces)  corresponded 
very  closely  to  that  of  the  camera  providing  the  "simple"  monoscopic  view  of  the  taskboard,  the  visual  feedback 
provided  the  operator  was  very  similar  to  that  available  under  normal,  direct  viewing  conditions.  In  other  words,  the 
operator’s  visual-motor  frame  of  reference  remained  largely  unchanged  with  only  minor  adjustments  needing  to  be 
made  for  distortions  introduced  by  video  viewing  with  a  fixed  position  camera.  For  example,  under  the  simple 
monoscopic viewing condition, when the operator moved the manipulator grip to the left 5°, the image of the
manipulator gripper in the display moved left by an amount approximating and directly proportional to 5°. Using the
simple monoscopic view, when the operator moved the manipulator grip in depth, closer to or further away from
the taskboard and its eyelets, the relative size of the gripper changed accordingly in his view of the scene, with
some vertical movement but little or no lateral movement on the display screen. When an orthogonal monoscopic
view  was  provided,  however,  the  operator  was  required  to  perform  a  shift  in  his  visual-motor  frame  of  reference  to 
perform  the  task.  A  manipulator  grip  movement  to  the  left  5°  was  seen  as  a  much  smaller  angular  shift  to  the  left  in 
the  orthogonal  view.  More  importantly,  though,  a  movement  of  the  arm  in  depth  toward  or  away  from  the  taskboard 
resulted  in  a  sizeable  angular  shift  to  the  left  or  right  on  the  display  screen.  The  added  mental  burden  of 
transforming  one's  visual-motor  frame  of  reference  may  have  discouraged  operators  from  using  this  display 
whenever it competed with the simple monoscopic view. An alternate explanation for the results holds that some
component of the observed preference for the simple monoscopic display may have been the result of operant
conditioning. Since the viewpoint provided by the simple monoscopic display was very similar to (in fact identical
to one-half of) the highly preferred stereoscopic display, operators would, over the course of testing, build up
greater familiarity with that viewpoint and perhaps come to associate it with more rewarding performance. A
definitive answer to these speculations is beyond the scope of this experiment and this brief report. Perhaps more
work will be performed in the future to clarify these issues and gain a more precise and satisfying understanding of
the powerful effects of viewpoint and visual-motor correspondence on remote manipulation and remote
operations in general.
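
The frame-of-reference argument above can be made concrete with a simple projection. The sketch below is a deliberate simplification and not part of the original analysis: it treats the camera as orthographic, considers only motion in the horizontal plane, and uses an arbitrary sign convention. The 45° offset is taken from the camera configuration described in Section 2.3; the function name is hypothetical.

```python
import math

def apparent_lateral_shift(lateral: float, depth: float,
                           camera_offset_deg: float) -> float:
    """On-screen lateral component of a gripper movement.

    `lateral` and `depth` are movement components in the operator's frame
    (the frame of the simple monoscopic camera). The camera is rotated about
    the vertical axis by `camera_offset_deg`.
    ASSUMPTIONS: orthographic projection, horizontal-plane motion only,
    arbitrary sign convention for the rotation direction.
    """
    theta = math.radians(camera_offset_deg)
    return lateral * math.cos(theta) + depth * math.sin(theta)

# Unit movements seen by the simple (0 deg) and orthogonal (45 deg) cameras.
lateral_in_simple = apparent_lateral_shift(1.0, 0.0, 0.0)
depth_in_simple = apparent_lateral_shift(0.0, 1.0, 0.0)
lateral_in_ortho = apparent_lateral_shift(1.0, 0.0, 45.0)
depth_in_ortho = apparent_lateral_shift(0.0, 1.0, 45.0)

print(f"simple view: lateral -> {lateral_in_simple:.2f}, depth -> {depth_in_simple:.2f}")
print(f"ortho view:  lateral -> {lateral_in_ortho:.2f}, depth -> {depth_in_ortho:.2f}")
```

In this simplified model a pure depth movement produces no lateral image shift in the simple view but a substantial one in the offset view, while a lateral movement appears reduced in the offset view, which matches the qualitative description given in the text.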